Directed Graph Model for Mongolian Lexical Analysis

Author

  • JIANG Wenbin
Abstract

We propose a generative statistical model for Mongolian lexical analysis. This model describes the lexical analysis result as a directed graph, where the nodes represent the stems, affixes and their tags, while the edges represent the transition or generation relationships between nodes. In particular, this work adopts three kinds of transition or generation probabilities: a) probabilities of stem-stem transition, affix-affix transition and stem-affix generation; b) the transition or generation probabilities between the corresponding tags; and c) the generation probabilities between stems or affixes and their tags. Using the 3rd-level annotated corpus with about 200,000 words as the training data, this model achieves a word-level segmentation accuracy of 95.1%, and a word-level joint segmentation and tagging accuracy of 93%.
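
The three probability families listed in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact factorization, so the sketch simply assumes that the score of one candidate analysis is the sum of the log-probabilities on the edges of its graph. The table names, the placeholder tokens (`stemA`, `affX`), the tags (`N`, `CASE`), the probability values and the smoothing floor are all hypothetical.

```python
import math

def log_prob(table, key, floor=1e-6):
    # Look up a probability; unseen events get a small floor (an assumed smoothing choice).
    return math.log(table.get(key, floor))

def score_analysis(words, model):
    """Score one candidate analysis.

    words: list of (stem, stem_tag, [(affix, affix_tag), ...]) tuples.
    Returns the sum of log-probabilities over all edges of the candidate graph.
    """
    total = 0.0
    prev_stem, prev_stem_tag = "<s>", "<s>"  # sentence-start placeholder
    for stem, stem_tag, affixes in words:
        total += log_prob(model["stem_stem"], (prev_stem, stem))              # (a) stem-stem transition
        total += log_prob(model["tag_stem_stem"], (prev_stem_tag, stem_tag))  # (b) same edge between tags
        total += log_prob(model["emit"], (stem_tag, stem))                    # (c) tag generates stem
        prev_unit, prev_tag = stem, stem_tag
        for i, (affix, affix_tag) in enumerate(affixes):
            # The first affix is generated from the stem; later affixes follow the previous affix.
            kind = "stem_affix" if i == 0 else "affix_affix"
            total += log_prob(model[kind], (prev_unit, affix))                # (a)
            total += log_prob(model["tag_" + kind], (prev_tag, affix_tag))    # (b)
            total += log_prob(model["emit"], (affix_tag, affix))              # (c) tag generates affix
            prev_unit, prev_tag = affix, affix_tag
        prev_stem, prev_stem_tag = stem, stem_tag
    return total

# Toy probability tables with made-up values, for illustration only.
model = {
    "stem_stem": {("<s>", "stemA"): 0.5},
    "tag_stem_stem": {("<s>", "N"): 0.5},
    "stem_affix": {("stemA", "affX"): 0.4},
    "tag_stem_affix": {("N", "CASE"): 0.5},
    "affix_affix": {},
    "tag_affix_affix": {},
    "emit": {("N", "stemA"): 0.2, ("CASE", "affX"): 0.5},
}

# Two candidate analyses of the same surface word: segmented vs. unsegmented.
cand_a = [("stemA", "N", [("affX", "CASE")])]
cand_b = [("stemAaffX", "N", [])]

best = max([cand_a, cand_b], key=lambda cand: score_analysis(cand, model))
```

With these toy tables the segmented candidate `cand_a` scores higher, because every edge it uses has been seen, while `cand_b` falls back to the floor for its unseen stem. A real system would estimate the tables from the annotated corpus and search over all candidate graphs rather than enumerate them by hand.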

Similar resources

Analysis on Lexical Errors in Writings of Mongolian English Majors

The purpose of the study is to examine the types of lexical errors committed by Mongolian EFL learners in their writing. A total of 525 errors in 62 English writings by Mongolian English majors were identified and analyzed. Supplementary information was also collected by means of a questionnaire and interviews for a more comprehensive understanding of the factors affecting the errors. Possible solutions...

HMM and CRF Based Hybrid Model for Chinese Lexical Analysis

This paper presents the Chinese lexical analysis systems developed by the Natural Language Processing Laboratory at Dalian University of Technology, which were evaluated in the 4th International Chinese Language Processing Bakeoff. The HMM and CRF hybrid model, which combines a character-based model with a word-based model in a directed graph, is adopted in system development. Both the closed and open t...

Directed prime graph of non-commutative ring

The prime graph of a ring $R$ is a graph whose vertex set is the whole set $R$, and any two elements $x$ and $y$ of $R$ are adjacent in the graph if and only if $xRy = 0$ or $yRx = 0$. The prime graph of a ring is denoted by $PG(R)$. Directed prime graphs for non-commutative rings and connectivity in the graph are studied in the present paper. The diameter and girth of this graph are also studied in the pa...

A method for analyzing the problem of determining the maximum common fragments of temporal directed tree, that do not change with time

In this study two actual types of problems are considered and solved: 1) determining the maximum common connected fragment of the T-tree (T-directed tree) which does not change with time; 2) determining all maximum common connected fragments of the T-tree (T-directed tree) which do not change with time. The choice of the primary study of temporal directed trees and trees is justified by the wid...

Structural Properties Of Lexical Systems: Monolingual And Multilingual Perspectives

We introduce a new type of lexical structure called lexical system, an interoperable model that can feed both monolingual and multilingual language resources. We begin with a formal characterization of lexical systems as “pure” directed graphs, solely made up of nodes corresponding to lexical entities and links. To illustrate our approach, we present data borrowed from a lexical system that ha...

Publication date: 2011